Member rate £492.50
Non-Member rate £985.00
Save £45 Loyalty discount applied automatically*
Save 5% on each additional course booked
*If you attended our Methods School in Summer/Winter 2024.
Monday 31 July - Friday 4 August
09:00-12:30
Please see Timetable for full details.
The course will introduce participants to the family of methods known as ‘content analysis’ using a variety of examples from political science and other disciplines. It will cover the basic aspects of content analysis, starting with manual content analysis and continuing with an introduction to some of the most popular approaches to computer-assisted content analysis. It will also cover practical aspects of performing content analysis, such as creating coding schemes, selecting documents, assessing inter-coder reliability, scaling, and validating the content analysis output. The course will be taught in a mix of lectures and seminars, and participants will have the opportunity to practise with hands-on exercises. The majority of the exercises will be completed by following, step by step, code provided in the R statistical software, so previous knowledge of R will not be necessary. In addition, participants will be able to present their own project in class and receive feedback.
Kostas Gemenis is Senior Researcher in Quantitative Methods at the Max Planck Institute for the Study of Societies.
His research interests include measurement in the social sciences, and content analysis with applications to estimating the policy positions of political actors.
He is currently involved in Preference Matcher, a consortium of researchers who collaborate in developing e-literacy tools designed to enhance voter education.
Most social science concepts are not directly observable. Content analysis provides a method for measuring quantities of interest that are otherwise difficult to estimate. For instance, by content analysing the speeches of legislators, we can classify them as charismatic, populist, authoritarian, liberal, and so on. Similarly, by analysing the content of newspaper editorials, we can infer whether the media in question were biased in favour of a particular candidate during an election campaign.
Content analysis is therefore typically defined as a method whose goal is to summarize a body of information, often in the form of text, and to make inferences about the actor behind this body of information. This implies that content analysis can be seen as a data reduction method, since its goal is to reduce the text material into more manageable bits of information. Content analysis can also be seen as a method for descriptive inference. Weber (1990, p. 9), for instance, defines content analysis as ‘a method that uses a set of procedures to make valid inferences from text’. The idea is that, by analysing the textual output of an actor, we can infer something about this actor. This conceptualization implies that we can use content analysis as a tool for measurement in the social sciences. In this view of content analysis we are concerned with replicability and objectivity (Neuendorf 2002, pp. 10–15), and we should therefore distinguish content analysis from other approaches and methods such as discourse analysis, rhetorical analysis, constructivism, ethnography and so on.
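The data-reduction idea can be illustrated with a minimal sketch. The Python snippet below reduces a short text to a table of word counts. Python is used here only for illustration, since the course exercises themselves use R. The function name `term_frequencies` and the sample sentence are hypothetical, invented for this example.

```python
import re
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Lowercase the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical one-sentence "document", invented for illustration only.
speech = "We will cut taxes, and we will cut waste."
counts = term_frequencies(speech)

# Print each distinct word with the number of times it occurs.
for word, n in sorted(counts.items()):
    print(word, n)
```

The nine-word sentence is reduced to six distinct terms with their frequencies, which is the kind of "more manageable" representation that later steps (dictionaries, scaling, classification) typically start from.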
The course intends to familiarize participants with both manual and computer-assisted content analysis. Following Krippendorff (2004) and Neuendorf (2002), the course will introduce participants to the basic concepts and building blocks in content analysis designs. For instance, the following questions will be addressed and discussed during the course:
The course will focus on both manual and computer-assisted content analysis. For manual content analysis, the course will also look at the often-overlooked distinction between the analysis of manifest content and judgemental coding. For computer-assisted content analysis, the course will offer an introduction to a variety of popular methods, such as the use of content analysis dictionaries, scaling methods (Wordscores, Wordfish), and supervised learning approaches. The course will also discuss the relationship between reliability and validity, and illustrate methods for estimating inter-coder reliability and for validating the results produced by computer-assisted content analysis.
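As one illustration of estimating inter-coder reliability, the sketch below computes Cohen's kappa for two coders who assigned categories to the same units. Cohen's kappa is chosen here only as a simple, widely known agreement statistic; it is not necessarily the measure used in the course (the readings discuss Krippendorff's alpha). Python is used for illustration only, since the course exercises use R. The function name `cohens_kappa` and the coded data are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must code the same, non-empty set of units.")
    n = len(coder_a)

    # Observed agreement: share of units given the same category by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected agreement by chance, from each coder's marginal category shares.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical category for every unit.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for eight text units (e.g. the tone of eight editorials).
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos"]
coder_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]

kappa = cohens_kappa(coder_a, coder_b)
print(kappa)
```

In this invented example the coders agree on six of eight units (observed agreement 0.75), while their marginal distributions imply 0.5 agreement by chance, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.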
The course will be taught in a mix of lectures and seminars, and participants will have the opportunity to practise with hands-on exercises. The examples used to illustrate the promises as well as the pitfalls of content analysis will cover various applications across the social sciences (e.g. sentiment analysis of the press, frame analysis of social movements, estimating the positions of political actors, agenda-setting in the EU). The majority of the exercises will be completed by following, step by step, code provided in the R statistical software, so previous knowledge of R will not be necessary. In most of the seminars we will use R Studio. Follow the link for download instructions: https://www.rstudio.com/products/rstudio/download/. In addition, participants will be able to present their own project in class and receive feedback.
Participants are expected to be familiar with basic statistical concepts such as measures of central tendency (mean, median), dispersion (standard deviation), tests of association (Pearson’s r) and inference (χ², t-test). This material is covered in the first few chapters of introductory statistics or data analysis textbooks. A useful example is Pollock, P.H. III, The Essentials of Political Analysis, fourth edition (Washington, DC: CQ Press, 2012), Chapters 2, 3, 6, and 7. Some familiarity with the R statistical software is also desirable but not necessary. In most of the seminars we will use R Studio.
Day | Topic | Details |
---|---|---|
Monday | Introduction to content analysis; Manual content analysis I (designing a coding scheme) | Lecture (90 mins.); Lecture (90 mins.) |
Tuesday | Manual content analysis II (inter-coder reliability); Computer-assisted content analysis I (dictionary method) | Seminar (90 mins.); Lecture (90 mins.) |
Wednesday | Computer-assisted content analysis II (scaling methods) | Seminar (90 mins.); Lecture (90 mins.) |
Thursday | Computer-assisted content analysis III (supervised methods) | Seminar (90 mins.); Lecture (90 mins.) |
Friday | Manual content analysis III (latent coding) | Seminar (90 mins.); Lecture (90 mins.) |
Day | Readings |
---|---|
Monday | Neuendorf, Kimberly A. (2002) The content analysis guidebook. Thousand Oaks, CA: Sage, Chapter 1 (defining content analysis); Krippendorff, Klaus (2004) Content analysis: An introduction to its methodology, second edition. Thousand Oaks, CA: Sage, Chapters 5 (unitizing) and 7 (coding); Hayes, Andrew F., and Klaus Krippendorff (2007) Answering the call for a standard reliability measure for coding data. Communication Methods and Measures 1: 77–89 |
Tuesday | Grimmer, Justin, and Brandon M. Stewart (2013) Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21: 267–297; Laver, Michael, and John Garry (2000) Estimating policy positions from political texts. American Journal of Political Science 44: 619–634; Young, Lori, and Stuart Soroka (2012) Affective news: The automated coding of sentiment in political texts. Political Communication 29: 205–231 |
Wednesday | Laver, Michael, Kenneth Benoit, and John Garry (2003) Extracting policy positions from political texts using words as data. American Political Science Review 97: 311–331; Slapin, Jonathan B., and Sven-Oliver Proksch (2008) A scaling model for estimating time-series party positions from texts. American Journal of Political Science 52: 705–722; Bruinsma, Bastiaan, and Kostas Gemenis (2016) Validating Wordscores |
Thursday | Hopkins, Daniel J., and Gary King (2010) A method of automated nonparametric content analysis for social science. American Journal of Political Science 54: 229–247 |
Friday | Benoit, Kenneth, Drew Conway, Benjamin E. Lauderdale, Michael Laver, and Slava Mikhaylov (2015) Crowd-sourced text analysis: Reproducible and agile production of political data. American Political Science Review 110: 278–295; Gemenis, Kostas (2015) An iterative expert survey approach for estimating parties’ policy positions. Quality & Quantity 49: 2291–2306 |
R and R Studio
Yoshikoder, Lexicoder, and Jfreq free software downloads
None
None.
Summer School
Research Designs
A Refresher of Inferential Statistics for Political Scientists
Winter School
Introduction to Statistics for Political and Social Scientists
Winter School
Advanced Quantitative Text Analysis
Python Programming for Social Sciences: Collecting, Analyzing and Presenting Social Media Data